
Conversation

@chilo-ms
Contributor

@chilo-ms chilo-ms commented Oct 29, 2025

Description

For TRT EP's GetCapability(), in some cases GetSubGraph() won't add the graph's outputs to the ComputeCapability/IndexedSubGraph returned to ORT.

The issue comes from the following code:

...
if (node->GetOutputEdgesCount() > node->OutputDefs().size()) {
  ...  // execute here
} else {
  ...
  if (graph_output_names.find(output->Name()) != graph_output_names.end()) {
    graph_outputs_to_add[output] = output_order;  // missing this
  }
}
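
A sketch of how the corrected branch ends up (abridged, surrounding bookkeeping elided; as committed it uses insert(), see the review thread below):

} else {
  ...
  if (graph_output_names.find(output->Name()) != graph_output_names.end()) {
    // This output is the graph's output, so it must also be put into the
    // subgraph's output list.
    graph_outputs_to_add.insert({output, output_order});
  }
}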

Update TRT RTX EP as well.

Motivation and Context

#25373

@chilo-ms chilo-ms marked this pull request as ready for review October 30, 2025 21:18
Contributor

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@chilo-ms chilo-ms changed the title from "[TensorRT EP] Fix bug for missing outputs in the returning ComputeCapability/IndexedSubGraph" to "[TRT/TRT RTX EP] Fix bug for missing outputs in the returning ComputeCapability/IndexedSubGraph" on Oct 31, 2025
@gcunhase

@chilo-ms any more AIs needed for this to be merged? Thanks.

-  graph_outputs_to_add[output] = output_order;
+  // This output is the graph's output.
+  // So the output should be put into the subgraph's output list.
+  graph_outputs_to_add.insert({output, output_order});
Member


graph_outputs_to_add.insert({output, output_order});

Unlike before, this will not overwrite the entry if the key already exists.

Contributor Author

@chilo-ms chilo-ms Nov 20, 2025


If the key already exists in any of those maps, i.e. fused_inputs, fused_outputs, fused_outputs_to_add and graph_outputs_to_add, it's not necessary to override it.

input_order/output_order is simply a relative ordering associated with each input/output, so that when the final sub_graph's input and output lists are constructed from the maps above, inputs/outputs with smaller order indices appear before those with larger ones.

So the exact order index doesn't matter; it's sufficient that an output which should appear before another output has the smaller order index.

BTW, I added comments for input_order/output_order to explain how they are used.
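
To illustrate with a standalone sketch (placeholder output names, not the EP code itself): insert() keeps the first order index recorded for a key, and because only the relative ordering of the indices is used when the final lists are assembled, a skipped or repeated index changes nothing.

#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

int main() {
  std::map<std::string, int> graph_outputs_to_add;  // stand-in for the EP's NodeArg*-keyed map
  int output_order = 0;

  graph_outputs_to_add.insert({"scores", output_order++});  // first sighting: order 0
  graph_outputs_to_add.insert({"labels", output_order++});  // first sighting: order 1
  graph_outputs_to_add.insert({"scores", output_order++});  // key exists: insert() keeps order 0

  // Only the relative order indices matter when the output list is built,
  // so "scores" (0) still comes out ahead of "labels" (1).
  std::vector<std::pair<std::string, int>> outputs(graph_outputs_to_add.begin(),
                                                   graph_outputs_to_add.end());
  std::sort(outputs.begin(), outputs.end(),
            [](const auto& a, const auto& b) { return a.second < b.second; });
  for (const auto& [name, order] : outputs) {
    std::cout << name << " -> " << order << "\n";
  }
  return 0;
}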

if (graph.GetGraph().GetConsumerNodes(output->Name()).size() > 0) {
  fused_outputs[output] = output_order++;
}
for (const auto& output : node->OutputDefs()) {
Member


for (const auto& output : node->OutputDefs()) {

General comment for both inputs and outputs.
Should we check at some point if an optional input/output Exists()?
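
Not something this PR changes, but a sketch of what such a guard could look like, assuming NodeArg::Exists() is the appropriate check for a missing optional def:

for (const auto& output : node->OutputDefs()) {
  if (output == nullptr || !output->Exists()) {
    continue;  // skip missing optional outputs before any bookkeeping
  }
  // ... existing fused_outputs / graph_outputs_to_add handling ...
}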

*
*/
{
Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};
Member


Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};

Just instantiate it once

Contributor Author


fixed

Member

@yuslepukhin yuslepukhin left a comment


🕐

* |--- Mod ---> "labels"
*/
{
Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};
Contributor


there's already a global Ort::Env instance here - why do we need to define another one? the underlying OrtEnv is a singleton so this would refer to the same instance.

Contributor Author


changed to use global ort_env
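
A minimal sketch of that change, assuming the test binary exposes the shared environment as a global named ort_env (the helper below is hypothetical, for illustration only):

#include <memory>
#include "onnxruntime_cxx_api.h"

// Reuse the process-wide environment instead of constructing a new Ort::Env per test;
// the underlying OrtEnv is a singleton, so a second Ort::Env would only alias it.
extern std::unique_ptr<Ort::Env> ort_env;  // shared test environment (name taken from this thread)

// Hypothetical helper, for illustration only.
void RunModelWithSharedEnv(const ORTCHAR_T* model_path) {
  Ort::SessionOptions session_options;
  // (enable the TensorRT EP on session_options here)
  Ort::Session session(*ort_env, model_path, session_options);
  // (run the session and verify that every graph output is produced)
}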

return t;
}

OrtStatus* CreateModelWithNodeOutputNotUsed(const PathString& model_name) {
Contributor


for other tests that use models, we have checked in .onnx files and a script to generate them. is there a good reason to do it differently here? defining the model in a Python script (e.g., with onnxscript) could be more concise.

Contributor Author


okay, i added the python scripts as well as the models

@@ -0,0 +1,43 @@
import onnx

Check notice

Code scanning / CodeQL

Module is imported with 'import' and 'import from' (Note, test)

Module 'onnx' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.

Copilot Autofix

AI about 1 hour ago

To resolve the "Module is imported with both 'import' and 'import from'" issue, remove the from onnx import TensorProto, helper statement and reference TensorProto and helper via the onnx module (that is, use onnx.TensorProto and onnx.helper). Update all usages of helper and TensorProto in the code accordingly. No additional dependencies or code structure changes are required. Only lines in onnxruntime/test/testdata/node_output_not_used.py handling imports and references to helper and TensorProto need to be changed.

Suggested changeset 1
onnxruntime/test/testdata/node_output_not_used.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/onnxruntime/test/testdata/node_output_not_used.py b/onnxruntime/test/testdata/node_output_not_used.py
--- a/onnxruntime/test/testdata/node_output_not_used.py
+++ b/onnxruntime/test/testdata/node_output_not_used.py
@@ -1,15 +1,14 @@
 import onnx
-from onnx import TensorProto, helper
 
 
 def create_model_with_node_output_not_used(model_path):
     # Create graph
-    X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [3, 2])
-    W = helper.make_tensor_value_info("W", TensorProto.FLOAT, [2, 3])
-    Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [2, 3])
+    X = onnx.helper.make_tensor_value_info("X", onnx.TensorProto.FLOAT, [3, 2])
+    W = onnx.helper.make_tensor_value_info("W", onnx.TensorProto.FLOAT, [2, 3])
+    Y = onnx.helper.make_tensor_value_info("Y", onnx.TensorProto.FLOAT, [2, 3])
 
     # Dropout node (two outputs)
-    dropout_node = helper.make_node(
+    dropout_node = onnx.helper.make_node(
         "Dropout",
         inputs=["X"],
         outputs=["dropout_out", "dropout_mask"],
@@ -17,21 +10,21 @@
     )
 
     # MatMul node
-    matmul_node = helper.make_node(
+    matmul_node = onnx.helper.make_node(
         "MatMul",
         inputs=["dropout_out", "W"],
         outputs=["Y"],
         name="MatMulNode",
     )
 
-    graph = helper.make_graph(
+    graph = onnx.helper.make_graph(
         nodes=[dropout_node, matmul_node],
         name="DropoutMatMulGraph",
         inputs=[X, W],
         outputs=[Y],
     )
 
-    model = helper.make_model(graph, opset_imports=[helper.make_operatorsetid("", 13)])
+    model = onnx.helper.make_model(graph, opset_imports=[onnx.helper.make_operatorsetid("", 13)])
 
     onnx.checker.check_model(model)
     onnx.save(model, model_path)
EOF
Unable to commit as this autofix suggestion is now outdated
@@ -0,0 +1,78 @@
import onnx

Check notice

Code scanning / CodeQL

Module is imported with 'import' and 'import from' (Note, test)

Module 'onnx' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.

Copilot Autofix

AI about 1 hour ago

To address the issue, remove the from onnx import TensorProto, helper import, and instead refer to TensorProto and helper via the main namespace import: onnx.TensorProto and onnx.helper. This will make all references to ONNX symbols consistently qualified, improving code clarity. Specifically:

  • Remove line 2: from onnx import TensorProto, helper.
  • Change all references to TensorProto in this file to onnx.TensorProto.
  • Change all references to helper to onnx.helper.
    No other functional changes are required. Only the one file onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py needs editing.

Suggested changeset 1
onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py b/onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py
--- a/onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py
+++ b/onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py
@@ -1,45 +1,44 @@
 import onnx
-from onnx import TensorProto, helper
 
 
 def create_model_with_topk_graph_output(model_path):
     # ======================
     # ---- Inputs ----
     # ======================
-    input_tensor = helper.make_tensor_value_info("input", TensorProto.FLOAT, ["N"])
+    input_tensor = onnx.helper.make_tensor_value_info("input", onnx.TensorProto.FLOAT, ["N"])
 
     # ======================
     # ---- Initializers ----
     # ======================
-    K = helper.make_tensor("K", TensorProto.INT64, dims=[1], vals=[300])
-    zero = helper.make_tensor("zero", TensorProto.INT64, dims=[], vals=[0])
-    twenty_six = helper.make_tensor("twenty_six", TensorProto.INT64, dims=[], vals=[26])
+    K = onnx.helper.make_tensor("K", onnx.TensorProto.INT64, dims=[1], vals=[300])
+    zero = onnx.helper.make_tensor("zero", onnx.TensorProto.INT64, dims=[], vals=[0])
+    twenty_six = onnx.helper.make_tensor("twenty_six", onnx.TensorProto.INT64, dims=[], vals=[26])
 
     # ======================
     # ---- Nodes ----
     # ======================
-    topk_node = helper.make_node(
+    topk_node = onnx.helper.make_node(
         "TopK",
         inputs=["input", "K"],
         outputs=["scores", "topk_indices"],
         name="TopK",
     )
 
-    less_node = helper.make_node(
+    less_node = onnx.helper.make_node(
         "Less",
         inputs=["topk_indices", "zero"],
         outputs=["Less_output_0"],
         name="Less",
     )
 
-    div_node = helper.make_node(
+    div_node = onnx.helper.make_node(
         "Div",
         inputs=["topk_indices", "twenty_six"],
         outputs=["Div_17_output_0"],
         name="Div",
     )
 
-    mod_node = helper.make_node(
+    mod_node = onnx.helper.make_node(
         "Mod",
         inputs=["topk_indices", "twenty_six"],
         outputs=["labels"],
@@ -49,15 +16,15 @@
     # =========================
     # ---- Graph Outputs ----
     # =========================
-    scores_out = helper.make_tensor_value_info("scores", TensorProto.FLOAT, ["K"])
-    less_out = helper.make_tensor_value_info("Less_output_0", TensorProto.BOOL, ["K"])
-    div_out = helper.make_tensor_value_info("Div_17_output_0", TensorProto.INT64, ["K"])
-    labels_out = helper.make_tensor_value_info("labels", TensorProto.INT64, ["K"])
+    scores_out = onnx.helper.make_tensor_value_info("scores", onnx.TensorProto.FLOAT, ["K"])
+    less_out = onnx.helper.make_tensor_value_info("Less_output_0", onnx.TensorProto.BOOL, ["K"])
+    div_out = onnx.helper.make_tensor_value_info("Div_17_output_0", onnx.TensorProto.INT64, ["K"])
+    labels_out = onnx.helper.make_tensor_value_info("labels", onnx.TensorProto.INT64, ["K"])
 
     # ======================
     # ---- Graph ----
     # ======================
-    graph = helper.make_graph(
+    graph = onnx.helper.make_graph(
         nodes=[topk_node, less_node, div_node, mod_node],
         name="TopKGraph",
         inputs=[input_tensor],
@@ -65,7 +28,7 @@
         initializer=[K, zero, twenty_six],
     )
 
-    model = helper.make_model(graph, opset_imports=[helper.make_operatorsetid("", 13)])
+    model = onnx.helper.make_model(graph, opset_imports=[onnx.helper.make_operatorsetid("", 13)])
 
     # Validate + Save
     onnx.checker.check_model(model)
EOF
Unable to commit as this autofix suggestion is now outdated